Modeling the mixtures of known noise and unknown unexpected noise for robust speech recognition
Abstract
Real-world noise may be a mixture of known or trainable noise and unknown unexpected noise. This paper investigates the combination of the conventional noise-reduction techniques with the probabilistic union model to deal with this type of mixed noise for robust speech recognition. In particular, we have developed a multi-environment system to remove the known or trainable acoustic mismatch across different environments. The novelty of this system, in contrast to other multi-environment models, is that the acoustic model for each environment is built upon the probabilistic union model, so that this system is also capable of accommodating further unknown unexpected noise within a specific environment. We have tested the new system for connected digit recognition in different environments, each involving an environment-specific noise and some unknown untrained noise. The results indicate that the new system offers significantly improved performance for the environments involving unknown additional noise, in comparison to a baseline multi-environment system.
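The abstract does not give the formula for the probabilistic union model, so the following is only a minimal sketch under an assumed formulation: for N sub-band (or feature-stream) likelihoods and an assumed order M, the combined score is taken to be proportional to the sum, over every subset of N - M streams, of the product of the likelihoods in that subset. The function name `union_likelihood` and the example values are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import prod


def union_likelihood(stream_likelihoods, order):
    """Assumed order-M union score (unnormalized).

    Sums, over every subset of (N - order) streams, the product of the
    per-stream likelihoods in that subset. With order = 0 this reduces to
    the ordinary product over all streams.
    """
    n = len(stream_likelihoods)
    if not 0 <= order < n:
        raise ValueError("order must satisfy 0 <= order < number of streams")
    keep = n - order
    return sum(prod(subset) for subset in combinations(stream_likelihoods, keep))


# Hypothetical per-stream likelihoods: the last stream is hit by unexpected noise.
clean = [0.5, 0.4, 0.3]
corrupted = [0.5, 0.4, 1e-6]

# Order 0 (plain product) collapses when one stream is corrupted.
product_score = union_likelihood(corrupted, order=0)

# Order 1 keeps the term built from the two unaffected streams, so the
# corrupted stream no longer dominates the score.
union_score = union_likelihood(corrupted, order=1)
```

In this sketch the order-1 score stays close to the product of the two unaffected streams, which illustrates why a union-style combination can tolerate noise whose location is not known in advance.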
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
The Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in automatic speech recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
A new method for robust speech recognition based on missing data using a bidirectional neural network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Robust Identification of Smart Foam Using Set Membership Estimation in A Model Error Modeling Framework
The aim of this paper is robust identification of smart foam, as an electroacoustic transducer, considering unmodeled dynamics due to nonlinearities in behaviour at low frequencies and measurement noise at high frequencies as existent uncertainties. Set membership estimation combined with model error modelling technique is used where the approach is based on worst case scenario with unknown but...
Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components
Robust speech recognition via hidden Markov modeling of spectral vectors is studied in this paper. The hidden Markov model (HMM) mixture components are assumed to be complex Gaussian with zero mean and diagonal covariance, and to incorporate an unknown scalar gain term. The gain term is associated with each spectral vector and models the varying energy of speech signals. It is estimated by applyi...
Publication date: 2001